Overview

Dataset Statistics

Number of Variables 6
Number of Rows 344404
Missing Cells 418
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 190.0 MB
Average Row Size in Memory 578.6 B
Variable Types
  • Numerical: 3
  • Categorical: 3
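
The percentages in this overview are rounded to one decimal place, which can hide a small nonzero share of missing cells. The sketch below redoes that arithmetic from the figures reported above. The variable names are illustrative, and treating the total cell count as rows times variables is an assumption about how the report defines a cell.

```python
# Figures copied from the Dataset Statistics above.
n_rows = 344404
n_vars = 6
missing_cells = 418

# Assumption: total cells = rows x variables.
total_cells = n_rows * n_vars
missing_pct = 100 * missing_cells / total_cells

print(f"total cells:   {total_cells}")
print(f"missing share: {missing_pct:.4f}%")   # unrounded share
print(f"as displayed:  {missing_pct:.1f}%")   # rounded to one decimal place
```

Under that assumption the unrounded share is small but not zero, so the displayed 0.0% should be read as "rounds to zero" rather than "no missing cells".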

Dataset Insights

  • date has high cardinality: 2696 distinct values
  • reviewer_name has high cardinality: 41849 distinct values
  • comments has high cardinality: 335723 distinct values
  • date has constant length 10

Variables

listing_id

numerical

Approximate Distinct Count 9759
Approximate Unique (%) 2.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 5.3 MB
Mean 1.1998e+07
Minimum 6
Maximum 2.9981e+07
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • listing_id is slightly skewed right (γ1 = 0.133)

Quantile Statistics

Minimum 6
5th Percentile 585469
Q1 5.1343e+06
Median 1.2347e+07
Q3 1.8431e+07
95th Percentile 2.4581e+07
Maximum 2.9981e+07
Range 2.9981e+07
IQR 1.3296e+07
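
Range and IQR are derived from the other quantile figures: range is maximum minus minimum, and IQR is Q3 minus Q1. The sketch below recomputes both for listing_id from the rounded values printed above. Because the inputs are already rounded to five significant digits, the last digit of the result may differ slightly from the reported figure.

```python
# Quantile figures for listing_id, copied from the report (already rounded).
minimum = 6
q1 = 5.1343e6
q3 = 1.8431e7
maximum = 2.9981e7

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # full spread of the column

print(f"IQR   = {iqr:.4e}")
print(f"Range = {value_range:.4e}")
```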

Descriptive Statistics

Mean 1.1998e+07
Standard Deviation 7.7579e+06
Variance 6.0185e+13
Sum 4.1322e+12
Skewness 0.133
Kurtosis -1.1195
Coefficient of Variation 0.6466
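
Several of the descriptive statistics are tied together by simple identities: variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the sum is the mean times the row count. The sketch below checks those relationships for listing_id using the rounded figures from this report. It assumes every row contributes to the mean, which is consistent with the zero missing values reported for this column.

```python
import math

# Figures for listing_id, copied from the report (already rounded).
n_rows = 344404
mean = 1.1998e7
std = 7.7579e6

variance = std ** 2        # reported: 6.0185e+13
coeff_var = std / mean     # reported: 0.6466
total = mean * n_rows      # reported: 4.1322e+12

# Compare against the reported values with a tolerance for rounding.
for name, computed, reported in [
    ("variance", variance, 6.0185e13),
    ("coefficient of variation", coeff_var, 0.6466),
    ("sum", total, 4.1322e12),
]:
    ok = math.isclose(computed, reported, rel_tol=1e-3)
    print(f"{name}: computed {computed:.4e}, reported {reported:.4e}, consistent={ok}")
```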

id

numerical

Approximate Distinct Count 344390
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 5.3 MB
Mean 1.9289e+08
Minimum 8
Maximum 3.5025e+08
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • id is slightly skewed left (γ1 = -0.2358)

Quantile Statistics

Minimum 8
5th Percentile 2.6583e+07
Q1 1.1421e+08
Median 2.0204e+08
Q3 2.7718e+08
95th Percentile 3.353e+08
Maximum 3.5025e+08
Range 3.5025e+08
IQR 1.6297e+08

Descriptive Statistics

Mean 1.9289e+08
Standard Deviation 9.757e+07
Variance 9.5198e+15
Sum 6.6432e+13
Skewness -0.2358
Kurtosis -1.0963
Coefficient of Variation 0.5058
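
Skewness (γ1) and kurtosis summarize the shape of a distribution. The sign of γ1 indicates which tail is longer, and an excess kurtosis near -1.2 is what a flat, uniform-like distribution produces. The sketch below computes both from population central moments on toy data. The function name and sample values are illustrative, and the report does not state whether it uses population or bias-corrected estimators, so small differences from its figures are possible.

```python
def moments_shape(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt

# Toy data: evenly spread identifiers versus a sample with a long right tail.
uniform_ids = list(range(1, 1001))
right_tailed = [1, 1, 1, 2, 2, 3, 10]

skew_u, kurt_u = moments_shape(uniform_ids)
skew_r, kurt_r = moments_shape(right_tailed)

print(f"uniform:      skew={skew_u:.4f}, excess kurtosis={kurt_u:.4f}")
print(f"right-tailed: skew={skew_r:.4f}, excess kurtosis={kurt_r:.4f}")
```

Read against this, the negative kurtosis values reported for the identifier columns suggest distributions that are flatter than a normal curve, as one might expect from roughly sequential IDs.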

date

categorical

Approximate Distinct Count 2696
Approximate Unique (%) 0.8%
Missing 0
Missing (%) 0.0%
Memory Size 24.6 MB

Length

Mean 10
Standard Deviation 0
Median 10
Minimum 10
Maximum 10

Sample

1st row 2008-06-22
2nd row 2009-06-22
3rd row 2012-07-16
4th row 2013-07-07
5th row 2013-07-08

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 688808
Decimal Number 2755232
  • date contains 2696 distinct words (one per distinct date value)
  • date has words of constant length (10 characters)
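
The character-class labels in the Letter tables (Dash Punctuation, Decimal Number, Space Separator, Uppercase Letter, Lowercase Letter) match the names of Unicode general categories. The sketch below counts those categories for one sample date with Python's standard `unicodedata` module and scales the result by the row count. That the report derives its counts this way is an assumption, and the helper name is illustrative.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# One sample value from the date column; every value has the same 10-character shape.
sample = "2008-06-22"
counts = category_counts(sample)

n_rows = 344404
dash_total = counts["Pd"] * n_rows    # Pd = Dash Punctuation
digit_total = counts["Nd"] * n_rows   # Nd = Decimal Number

print(f"per value: {dict(counts)}")
print(f"Dash Punctuation total: {dash_total}")
print(f"Decimal Number total:   {digit_total}")
```

With two dashes and eight digits per value, the scaled totals line up with the Dash Punctuation and Decimal Number figures reported for date.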

reviewer_id

numerical

Approximate Distinct Count 310747
Approximate Unique (%) 90.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 5.3 MB
Mean 7.0651e+07
Minimum 29
Maximum 2.2583e+08
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • reviewer_id is skewed right (γ1 = 0.7471)

Quantile Statistics

Minimum 29
5th Percentile 2.9613e+06
Q1 2.1403e+07
Median 5.4897e+07
Q3 1.1158e+08
95th Percentile 1.896e+08
Maximum 2.2583e+08
Range 2.2583e+08
IQR 9.0173e+07

Descriptive Statistics

Mean 7.0651e+07
Standard Deviation 5.8953e+07
Variance 3.4754e+15
Sum 2.4332e+13
Skewness 0.7471
Kurtosis -0.5399
Coefficient of Variation 0.8344
  • reviewer_id is not normally distributed (p-value 1.1925e-05)

reviewer_name

categorical

Approximate Distinct Count 41849
Approximate Unique (%) 12.2%
Missing 1
Missing (%) 0.0%
Memory Size 23.4 MB

Length

Mean 5.7944
Standard Deviation 2.0405
Median 6
Minimum 1
Maximum 35

Sample

1st row Terrence
2nd row Christine
3rd row Hieu
4th row Daryna
5th row Mathew

Letter

Count 1971039
Lowercase Letter 1614686
Space Separator 13227
Uppercase Letter 356353
Dash Punctuation 1295
Decimal Number 261
  • reviewer_name contains many words: 35485 words

comments

categorical

Approximate Distinct Count 335723
Approximate Unique (%) 97.6%
Missing 417
Missing (%) 0.1%
Memory Size 142.0 MB

Length

Mean 282.5017
Standard Deviation 268.3813
Median 238
Minimum 1
Maximum 5205

Sample

1st row Sara is an awesome...
2nd row My stay at “Crafts...
3rd row This was my first ...
4th row Sara was a very pl...
5th row Sara was such an a...

Letter

Count 76582593
Lowercase Letter 74187807
Space Separator 17329660
Uppercase Letter 2394786
Dash Punctuation 90084
Decimal Number 170183
  • comments contains many words: 125781 words

Interactions

Correlations

Missing Values